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Abstract 

In areas such as kernel smoothing and non-parametric regression 
there is emphasis on smooth interpolation and smooth statistical mod- 
els. Splines are known to have optimal smoothness properties in 
one and higher dimensions. It is shown, with special attention to 
polynomial models, that smooth interpolators can be constructed by 
first extending the monomial basis and then minimising a measure of 
smoothness with respect to the free parameters in the extended basis. 
Algebraic methods are a help in choosing the extended basis which 
can also be found as a saturated basis for an extended experimental 
design with dummy design points. One can get arbitrarily close to 
optimal smoothing for any dimension and over any region, giving a 
simple alternative models of spline type. The relationship to splines 
is shown in one and two dimensions. A case study is given which 
includes benchmarking against kriging methods. 



1 Introduction 



There is a considerable literature on smooth interpolation and its statistical 
counterparts. The area of non-parametric regression is an example. The 
optimal smoothness properties of splines have a su bstantial literatu re. The 
optimaUty result for one dimensions is attributed to iHoUadavl |1957| an d for 



two dimensio ns, where thin-plate splines are optimal, tojDuchon 



Miculal |2002[ | for a nice review on spline optimality and lKimeldorf and Wahba 



1976 



see 
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[l970l for an overview. In computer experiments Bayesian kriging using 
Gaussian kernel s tochastic process mode l s has been preferred to splines, 



Sacks et al.l |1989l |. [Kennedy and O'HaganI [200l|. and have also become pop 



ular in machine learning: iRasmussen and Williamsl [2005[. Of course, the 



connection between kriging and spline is thoroughly researched and, for ex- 
ample, splines can arise as kriging (co nditional expectation) interpo lators for 



special Gaussian stochastic processes: Kimeldorf and Wahba 197d . 



Raw polynomial interpolation is known in general not to have optimal 
rates of interpolation unless special sampling (design) points are used such as 
in Tchebychev approximation. On the other hand the existence of polynomial 
interpolators over an arbitrary design is at the core of the newer theory of 
algebraic statistics: for any arbitrary design in d dimensions there is always 
a monomial basis out of which we can build a polynomial interpolator. This 



was introduced in to statistics bylPistqne and Wynd |1996| . cover ed at length 



in the monograph iPistone et al.l |2001| and was also the basis for iBates et al 



2003l | which can be seen as the forerunner of the present paper. 



The aim of the present paper is to try to have the best of both worlds: to 
draw a little on the algebraic theory but principally to show, in an rather ele- 
mentary way, how to construct smooth polynomial interpolators or statistical 
models. This is achieved by extending the model basis and using this free- 
dom to optimise a measure of smoothness. It should be pointed out that the 
use of polynomials to b uild kernels with pre-specified properties is familiar 



in signal processing, see iLin et al.l |2004j . By extending the model basis we 



can show that our interpolators get arbitrarily close to optimal interpolators, 
which are typically in the spline family. 



1.1 Monomial bases and extended bases 

Recent work in the area of "algebraic statistics" shows how to construct es- 
timable (identifiable) monomial bases for polynomial regression and we start 
with a very short description. Having said this, it is not necessary to use these 
methods, nor indeed to use polynomials. For example a Fourier (trigonomet- 
ric) basis may be used. The point is that we shall need an extended basis 
with certain conditions and the algebra is one way of achieving this. 

We start with a set of factors x = {xi, . . . ,Xd). For a set of non- 
negative integers a = {ai, . . . ,ad), a monomial, such as xlx2, is written 
and a polynomial is a linear combination of monomials. A 
design D„ is a set of n distinct points in d dimensions, D„ = {x^^\ . . . , x^"'^}, 
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G = 1, . . . ,n. 



The algebraic methods give us the following: given an experimental de- 
sign, Dn, it is always possible to find a saturated non-singular monomial basis 
Bl = {xa, a E L}. Thus, the size of the basis is equal to the size of the design 
\L\ = \Dn\ = n and the n x n X- matrix, X = {x°'}x£D„,a(^L is non-singular. 
We call such a basis a good saturated basis for the design. The intuition be- 
hind algebraic methods is simple: terms are included in the good saturated 
basis according to a te rm ordering and a rank inclusion criterion. For details 
on term orderi ngs see ICox et al.l 1997l |. and for description of the algebraic 



technology see iPistone et al.l |2001 



Example 1 Let D24 to be the first 24 points of a bidimensional Sobol's space 
filling sequence. An irn plementation of the description of SoboF sequence by 
Bratlev and Foxl 1988l | is available in the language R, see llhaka and Gentleman 



19961 . Then by selecting terms with a degree lexicographic term order 



xi y X2, 



a good saturated basis with 24 monomials is identified for D24,. 
This model includes the monomials 2 plus all the terms of a 

model of total degree five. This basis will be extended in the example of 
Section 13. 2[ 



It will be critical in our development that we may extend a basis. By 
this we mean we keep the design Dn fixed but take a larger set of N > n 
monomials, hence the term "supersaturated" in the title if the paper. But 
we require a condition contained in the following definition. 

Definition 1 Given a design Dn, with sample size n, a good supersaturated 
basis is a basis Bm = {a;°, a G M} with \B\ = N > n such that there is a 
hierarchical non- singular sub-basis of size n. 

Here is an example to show that we have to be a little careful. Let us start 
with a rather poor design in two dimensions: D^ = {(0, 0), (1, 1), (2, 2), (3, 3)}. 
Then, and this is obvious without any algebra, there are only two good satu- 
rated model bases {l,Xi,x1,xl} or {l,X2,xl,xl}. From this we can see that 
the extended basis {l,xi,xl,X2,xl} with five terms is not good as there is 
no good sub-basis of size four. 

If we start with a non-singular basis for a design Dn and extend it, in 
any way, then we always obtain a good supersaturated basis. But there 
is a revealing way of generating a good supersaturated basis and that is 
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by extending the design D„ to a design with N points and finding a 
good saturated basis for larger design, which contains the good basis for Dn- 
The algebra shows that this is always possible. This leads to a second, and 
equivalent, way of producing the smooth models which will be called the 
"dummy design" method, covered in sub-section 2.2. 

2 Smooth interpolators 

The basic idea of this paper may seem at first to be somewhat contradictory. 
We start with given polynomial interpolator and by extending the basis make 
the interpolator smoother. Although one may naturally associates higher 
order polynomial terms with lack of smoothness, we can, in fact, extend the 
basis and use the freedom this gives to increase smoothness. 

Let the experimental design be Dn and . . . ,y„ be real values (obser- 
vations) at the design points x*^*) G Dn, i = 1, . . . ,n, respectively. Let Bm be 
a good supersaturated basis for the design Dn and let 



be a polynomial in that basis. A good supersaturated model will be sought 
for using a measure of smoothness. 

In one dimension {d — 1) we shall adopt the following measure of smooth- 
ness based on the second derivative 



where the integration is carried out in a desired region A' C IR. For higher 
dimensions the Hessian is 




(1) 




(2) 




and we have 



4^ \dxidxjj 



H{y{x))\f = tr3,ce {H{y{x)f) . 



(3) 



Then define 




(4) 
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for some desired region A" C ffi . 

The smooth interpolator is y{x) = J2aeM ^aX°', where the coefficients 9a 
are selected to minimise smoothness subject to the interpolation condition, 
i.e. solving the constrained optimisation problem 

min '^2(y(x)) subject to = y{x^'^^), i — 1, . . . ,n (5) 



In the next subsection we give the solution of this constrained problem and 
the in the second subsection give the dummy design method, which is equiv- 
alent. 



2.1 The constrained problem 

The only technical difficulty arises from the fact that linear parts of the model 
make no difference to the criterion ^2 but do affect the interpolation. It is 

necessary to partition the X-matrix to take account of this. 

Let f{x) and 6 respectively be the vectors which hold the good supersat- 
urated basis and the parameters so that we can write (1) as y{x) = 6^ f{x). 
Denote f'-^^^ = f ■^i'^^ and define 

K = J^(^i2j^''^f'^''^^^dx, (6) 



Then we see that 



^2{y{x)) = 9^ KB (7) 



The technical difficulty mention above means that K may not be full rank. 
In particular any linear term in the models basis will give zero entries. Call 
this entries structural zeros. Permute the rows and columns of K so that the 
structural zeros are adjacent: 



K 




K 



(8) 



Let X = [Xo, Xi], / = (/o^ : f^f and 9 ^ {91 : 9ff be the corresponding 

rearranged and partitioned versions of Xn, f and 9, respectively. The matrix 
X has n rows and as many columns as terms in /. Let y, be the column 
vector with n observations and note that ^2 = 9jK9i. 
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With this the constrained quadratic problem (5) is: 



min 9jK9i subject to Xq9q + XiOi = y 



(9) 



Let 2A be an n X 1 vector of Lagrange multiphers (2 is for convenience) so 
that the Lagrangian is 

elkei-2\{Xoeo + Xi9i). 

After differentiation the full set of equations for 6o,Oi and A can be written 
in block form 

Xq Xi 
K 






■-0 









' y ' 













A 








(10) 



If the matrix on the left hand side is nonsingular we obtain a unique solution 
00,01, A. The following three conditions guarantee this. 

(i) The full basis is a good supersaturated basis for Dn so that X is full 
rank. 

(ii) Xq is full rank. 

(iii) K is full rank and thus invertible. 

The full matrix inverse with solutions 0q,0i, X are given in Appendix L 
Finally, using these results, we express the smooth estimator as 

m = Oofo+0ifi = 0f{x) 

and the optimal \l/2 as 

= 0jK0\. 

In applications, as is common with quadratic programme, we simply invert 
the matrix on the right hand side of (9) using a fast numerical method. Thus, 
given the design Dn, the good supersaturated basis and K, the method is 
fairly straightforward to implement. 

It is revealing to consider the case where K is nonsingular. Then we do 
not need the partition of Equation ([HD and instead can write Equation flTUl) 
as 



" X 


" 




' ' 




y 


k 


-X 




A 








Which has the solution: 

= (X^X + K{I - P)K)-^X^y 
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where P = X'^{XX'^)~^X is the projector onto the row space of X. Thus, 
ahhough X^X is not invertible, because we have a supersaturated model, 
the second term K{I — P)K on the left hand side can be seen as a smoothness 
induced regularisation of the problem which compensates for this singularity. 

2.2 The dummy design method 

For simplicity of development we assume that K is non-singular in the present 
case. Let he a. large design, with N > n distinct points, which contains 
the original design D„ and write 



where q = N — n. Let h{x) be a good saturated basis for Z}„, and let f{x) 
be an (extended) good saturated basis for Z^jy? f{x) = {h{x)'^ , g{x)'^)'^ . Also 
extend the observation vector to ^ = {y^, z^Y where, as before y holds the 
"true" observations taken at points in D„, and z can be thought of as dummy 
observations on the design Dq. The extended model we write 



and assume, as in the last section, that y{x) interpolates the observations y 
over Dn- 

We now minimize ^'2 over the the choice of dummy observations z which 
is now an unconstrained optimization problem, but with a reduced set of 
free parameters, namely z. The constrained optimization (8) and this un- 
constrained optimization are equivalent in the case that the full basis is a 
good for the full design, D^. This is because of the one to one correspondence 
between observations and parameters and the fact that the interpolation con- 
straint is the same in both cases. 

The unconstrained problem is: 



Where Xjv is the X-matrix for the full large model f{x). First, let 
the following matrix be partitioned according to the model bases f{x) — 



Dn 



y{x) = f{xfe = h^ix)/3 + g^{x)^ 



(11) 



min {y^:z^)X^'^KX^' ©. 



(12) 



{h{xf,g{xrf: 
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Then after expanding (11) and differentiating, the optimal z is 



^ = -^22 M\V 



and the minimum value of the smoothness is 



where Q — An — 



^22^^21- The smooth interpolator is 



is the appropriate partition of Xjv, i.e. the rows of Xn are indexed by Dn 
and Dg, while the columns are indexed by h{x) and g{x). 

The last equahty and the equivalence to the solution in the last subsection 
is shown for the case that K is non-singular. The equivalence in general holds 
under conditions (i), (ii) and (iii) in that section. We not that the solution 
does do not depend on the dummy design Dg, except in so far as it is involved 
in guaranteeing that we have a good supersaturated basis. 

3 One and two dimensions 

3.1 A one dimensional example: spline- like behavior 

In this example, smooth saturated models are used for interpolating a known 
univariate function. The function considered is the sine cardinal 



with a — 15n/2 and b — —lOn/2. The region over which the interpolators 
will be smoothed is X — [0, 1]. 

Suppose that the design Dq is a uniform design in [0, 1], and that the re- 
sponse vector y contains the values of m{x) at points in D^. The choice of a 
good saturated and supersaturated models can be driven by algebraic meth- 
ods. For the present case, an obvious candidate is h{x) — {l,x, . . . ,x^Y . 



(13) 



where 




m{x) — sinc(ax -\-h) — sin(aa; + h)/ {ax + b) 
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Call yQ to the interpolator fitted solely with h{x). Now a process of smooth- 
ing is carried out by adding dummy points, one at a time. While adding 
dummy points h{x) remains unchanged. With only one dummy point, a 
clear candidate for g{x) is g{x) = {x^), while for q dummy points, g{x) = 
{x^, . . . , x^~^'^~^) could be used. Call yg to the smooth interpolator obtained by 
adding q dummy points, g = 1, . . . , 5. The value of smoothness for yg quickly 
drops down so that a similar smoothness to that of a spline is achieved with 
^4 (only four extra terms) , see Table [H The progressive smoothing achieved 
with extra terms can be appreciated graphically as well. Figure [1] shows the 
interpolator and smooth saturated models. 




Figure 1: Sequence of smooth saturated models: yo is a polynomial of fifth 
degree (- -), yi, . . . ,y4 ( — ) are supersaturated models. True model m{x) (...) 
and design points are also shown. 



Model 


m 


m 


m 


^3 






Spline 


^2 


76.543 


74.698 


33.153 


33.020 


27.767 


27.745 


26.744 



Table 1: Value of \E'2 for supersaturated models interpolating m{x) over Dq 
of Section 13.11 

A comparison between the smooth supersaturated method and cubic 
splines, which are optimally smooth, is carried out as follows. First, for 
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a uniform design D„ on [0, 1], a saturated model yo is fitted to the values of 
m(x) at the design points. Call ^['2(0) the value of smoothness for {jq. Then, 
using extra q basis terms, a smooth supersaturated model i/q is fitted. Call 
"^2(1) corresponding value of smoothness. Additionally, a spline is fitted 
to the same data and call \l/2(sp) smoothness value. The important fea- 
ture is that the \E'2(0), \E'2(1), . . . form a decreasing sequence which converges 
surprisingly quick to \E'2(sp). This behavior can be quantified by plotting 
the ratio ^/^^(q)J^^ispj against the number of terms added to smooth the 
model. Figure [2] shows such comparison when Dn are uniform designs of size 
n = 5,10,15,20. 

lOi I 

Riq) ; 

5 ■. 
4 ■ 




Figure 2: Logarithm of smoothness ratio R{q) = ^/^^(q)J^^^{sp) against 
number of smoothing terms added q: sample sizes n = 5, 10, 15 (- — ). 
The line for n = 20 is indistinguishable from R{q) = 1. 



3.2 A two dimensional example: alternative to thin- 
plate splines? 

The objective of this example is to compare the performance of smooth su- 
persaturated interpolators against thin plate splines, but there is also interest 
to make comparisons against a kriging interpolator. Initially, interpolators 
of the three kinds above are constructed for a known function at given design 
points and then predictions over new design points are used to compare the 
performance of the interpolators. The known function is ■m{xi,X2), which is 
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constructed as m(xi, X2) = p(4xi — 2, 4x2 — 2), where p{xi,X2) is the peaks 
function from MATLAB(r). The objective of scahng and shifting p{xi,X2) is 
to include interesting features into the smoothing region X = [0, 1]^. 

In order to allow a good covering of the design region X without an 
excessive number of points, we use Sobol's space filling design D24 and h{x) 
to be the good saturated model of Example [H The response vector y contains 
the values of p{xi,X2) at points in D24,. 

A smooth supersaturated model was then fitted to this data using the 91 
terms of a good supersaturated complete model of degree twelve in xi,X2. 
Call this model y. A thin plate spline interpolator model was also fitted to 
the same data, which we refer to as ysp- A kriging interpolator, y^r, was also 
fitted using the model 

Y{x)=l3 + Z{x), (14) 

where Z{x) is a stochastic process with exponential covariance structure, i.e. 
Cov(Z(r),Z(s)) = exp{ZLlS^\r^ - s^n. 




-9 -5-13 7 -9 -5-13 7 



Usp ykr 

Figure 3: Smooth supersaturated predictions y against spline ysp and kriging 
predictions ykr for the extra design points in Section 13. 2[ 

For comparison, a set of predictions were generated for each model at 
new design points. The new design points were the next 500 points from the 
SoboF sequence used for the first step. The predictions obtained with the 
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smooth supersaturated model y are closely correlated with those of the spline 
jjsp and the kriging y^j. models, see Figure [3] (a) and (b), only showing bias 
for low predicted values, especially when comparing with the kriging model. 
Additionally, the root mean square error (RMSE) was computed using the 
true values g{xi, X2) and the predictions for each of the three models at the 
extra design points. The values of RMSE for the smooth supersaturated, 
spline and kriging models are 1.117,1.009,0.640, respectively. This figures 
represent the 7.7%, 7.0% and 4.4% of the response range, respectively. The 
results show that the smooth supersaturated models are a good alternative 
to splines for interpolation, which can also be seen in Figure H] against the 
simulated response. 

4 From interpolators to statistical models 
4.1 Designs points versus knots 

The bulk of the development in this paper concerns the use of the smooth 
function as interpolators. However they can be used as statistical models in 
a straightforward way. Recall that the solution are of the form 

y{x) = e^fix) = y^Bfix) 

for the matrix B, in one of the equivalent forms in the development. We 
see that y{x) is linear in the observations y. The idea is to make y a free 
parameter, that is to change the role of y. Indeed we could relabel y as (p 
and write the model as 

y = fBf{x) 

The design point in D„ become knots and we are parameterizing the model 
by the values at the knots. This is somewhat familiar in splines. With this 
change we are free to fit the models using any regression, stepwise regression, 
penalised method etc we choose. There is no requirement to observe at the 
knots. But when we have carried out the fitting and write y instead of we 
have the level of smoothness achieved by replacing yhyyin our formula for 
\E'2. Moreover we are free to choose the location of the knots and the "real" 
experimental design at which to observe. In terms of the dummy design 
method, this amounts to a double- dummying: once for the knots and once 
for the smoothness; even before we actually take observations. 
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Fi gure 4: Smooth supersaturated y, spline ysp and kriging predictions ykr 
against true simulated values y for the extra design points in Section 13.21 



The function k{x) = Bf{x) can be considered as special kernels each 
with a value unity at a design point and zero at other design points and we 
can write the model as ^iki{x)yi when the yi are observations or, in the 
parametric case just described, as Ylii ki{x)4>i. 
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4.2 Optimal design: for estimation or smoothness 

We restrict the discussion to the case that K is non-singular, again for sim- 
phcity. Then 

n = y'^Qy = y'^iXK-^x'^y'y 

We first note that the design D„, via the design model matrix X. affects the 
value of the smoothness in the interpolation case, even without any statistical 
considerations. Given that we have to choose the design before we observe y 
one may consider that some measure of the size oiQ = {XK~^X'^)~^ may be 
important. We may borrow criteria from the optimal design of experiments 
and seek to minimize some function of Q. In the case that K is non-singular 
det{Q) may be used, but as pointed out, since K is not typically full rank, 
nor is Q. 

We consider a small example. Let n = 3, = 5 and d — 1 and take 
the saturated basis and let both the design interval and the 

integration interval be X be [—1, 1] . We need to minimize ^2 — y^Qy with 
respect to the choice of four design points in [—1, 1]. After some analysis it 
can be shown that the optimal design take the form {—1, —a, a, 1} for some 
positive a. As expected, because of the two hnear terms, the matrix Q has 
rank two. The largest eigenvalue of Q takes the value 

12(1 + a^) 
a2(l - 2a2 + a^) 

Minimisation of the largest eigenvalue of Q leads to an optimal value of 
a — l/2-\/— 3 -I- VT? 0.52988. Minimising the product of the eigenvalues 
of Q gives a 0.40570. 

In the case that the design becomes a set of knots we are free to 
choose the actual design points separately. If we fit using smooth supersatu- 
rated models this gives an optimal design problem with the kernels {kj} given 
above. Continuing with the above example and guessing that the D-optimal 
on [0, 1] for the optimally smooth kernels obtained by the first solution takes 
the form {—1, —b, 6, 1} we find that I?-optimal solution as 



b^^y 1925 + 175^17 - 35 "\/2785 + 480^17 w 0.43402, 

which can, indeed, be confirm to be the D-optimum design by checking against 
the Kiefer-Wolfowitz General Equivalence Theorem. One see that these are 
not the same as the optimal knots. 
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But now an attractive possibility arises. Optimal design experimental 
design for splines has received some attention in the literature, but it has 
been considered a somewhat intractable problem. Now, given that splines 
can be found as the limit of polynomial models it may be considered that 
optimal design for splines can be found approximately by taking smooth su- 
persaturated models with large bases, and using one of a number of optimum 
design algorithms to find the (approximate) solution. One exchanges a prob- 
lem of handling real splines analytically with that of high dimensional linear 
algebra. This will be the subject of further research. 

In the case that we are free to choose the knots and the design points 
separately, a conceptually simple approach, then, to carry out two separate 
separate optimal "design" problems one for knot placement for smoothness, 
as above, and a second for, say, Z)-optimality of the design points. 

It becomes conceptually harder if we wish to take into account smoothness 
and statistical precision in a joint analysis. One might seek to minimize 
some portmanteau criterion with respect to a simultaneous optimizations 
over design points and knots. If, moreover, \&o is a statistical criterion such as 
from D-optimality, we might take as a criterion some weighted combination: 

(1- A)^o + A^2 

As the y— values at the knots are now unknown parameters (pi, in a linear 
model we have that the true smoothness is \l/2 = (f^Q<P is non-linear in 0. 



5 A case study: Engine Emissions Data 

The performance of a smooth supersaturated model is evaluat ed against a 



krigin g model using the engine emissions data set analysed in iBates et al 



2003l |. This data set comes from a computer experiment and comprises 48 
observations in five factors A^, C, A, B and M. An extra set of 49 observa- 
tions is available for validation purposes. The smooth supersaturated model 
y is constructed with 100 terms fitted to the set of 48 observations. For 
this model, 48 ter ms correspond to the good saturated basis proposed in 



Bates et al.l . l2003l . Section 6.3], and this forms h{x). A set of 22 terms are 
added to complement missing terms of total degree three and then a set of 
extra 30 terms of total degree four were added. All the extra 52 terms de- 
scribed form g{x) and were added using a degree lexicographic order. Call 
lisp and {jkr to the spline and kriging models constructed with the first data 
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set. The kriging model jjkr was built with a five dimensional extension of the 
covariance structure used in Equation (HM . 




30 60^ 90 120 150 

Usp 




30 60 90 120 150 

Vkr 



Figure 5: Smooth supersaturated predictions (y) against spline (ysp) and 
kriging predictions (ykr) for the validation data set of Section [51 



In the validation stage, predictions at the extra 49 design points were 
built using the three models y,ysp and ykr- The values of RMSE for y,ysp 
and ykr are 5.844, 5.896 and 4.450 respectively, which respectively represent 
the 4.4%, 4.5% and 3.4% of the range of the response values. The smooth 
supersaturated model y compares well with both spline and kriging. Figure 
E] shows that the predictions with the smooth supersaturated model are also 
closely correlated to those obtained with spline and kriging models. Figure 
|6] also shows the smooth supersaturated model to be a good predictor of the 
true response. 



6 Discussion and further research 

We have tried to show in this paper that the simple idea of extending a ba- 
sis in regression and using the free parameters which that gives to increase 
smoothness give interpolators which have the same order of magnitude error 
as the two main alternative: splines and kriging. For smaller dimensions not 
too many additional additional basis terms are need to give a large decrease 
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30 60 90 120 150 

y 




30 60 90 120 150 

y 



30 60 90 120 150 

y 

Figure 6: True values {y) against smooth supersaturated predictions (y), 
spline {jjsp) and kriging predictions ijjkr) for the validation data set of Section 

El 



in accuracy. Although there is still work to be done on the theory it seems 
clear that one can get arbitrarily close to the theoretically smoothest func- 
tions, namely splines. Moreover this can be achieved for complex regions of 
integration and sets of observation points (designs), limited only by a rank 
condition. 

There a number of ways in which one can generalise or adapt these meth- 
ods, which we discuss briefly. 
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1. The same analysis will go through for weighted criteria: 



*2 = 




H{y{xmMx)dx, 



where w{x) is a non-negative weight function. This simply changes the 
definition of K and K. 

2. The smoothness criteria we adopted is one of a number in a wider 
quadratic class such as 



where A{y{x)) is the gradient vector. Another is the deviation from a 
target 



and one could have weighted versions of them or even weighted combi- 
nations of different criteria. 

3. We have ignored analysis based on building in additional, more sta- 
tistical criteria, such as cross-validation to have a trade off between 
smoothness and statistical variation. A simple way of taking this for- 
ward would be to consider smooth supersaturated as adding to the cat- 
alogue of kernels which are now studied in many fields such computer 
experiments, non-parametric regression, imagining, machine learning 
and signal processing. They would be candidates for analysis using 
stepwise methods, AIC, BIC, LASSO and so on. 

4. A possible advantage of the kernels we have developed is that their 
polynomial nature makes them more tractable than, say, splines in some 
circumstances; for example for differentiation in sensitivity analysis, 
error propagation or integration. 

5. We summarize that given detailed attention to computational issues, it 
is possible to develop optimal experimental designs for the high degree, 
but smooth, kernel models which arise from the present methods. As 
mentioned, this may be a way of tackling optimal design for complex 
regions. 
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6. The same methods can be apphed for other bases, for example Fourier 
bases in one and higher dimensions. Again as the basis order gets larger 
one will tend to the optimal spline-like kernels. For Fourier bases one 
can gain smoothness by using higher frequencies, in seeming, but not 
actual, contradiction to the Nyquist sample theorem. 

7 Appendix 

7.1 Appendix 1: solution for 6q and 6i 

It is possible to use block matrix inverse methods, but they are a little cum- 
bersome. We first find ^o- Writing the equations out we have 

Xoeo + x,e, = y 

K0i-XfX = 
XoX = 

Solving for A from the second two equations we have 

A = {XiK-'Xf + XoX^y'XiOi 
Using this to eliminate 9i from the first equation we have 

X^{X,K-^X^ + XoX^y^XoOo = X^{X^K-^X^ + XoX^y'y, 

giving 

= iX^{XiK-'X^ + XoX^)-^Xo)-'X^{XiK-'X^ + XoX^y^y, 
Writing y* — y — XqOq we obtain reduced matrix equation: 



' Xi " 




A 




' y* ' 


K -Xf 









.0 ^0^^ . 










Left multiplying by the transpose of the matrix on the left and inverting we 
have 

§1 = {X'[X^ + K{I - X'({XX'')-'X^)k)-^X^y* (15) 

Note that in the case that Xq and Xi have orthogonal columns we reduce to 
the standard form 

k = {XlXoY^X^y 
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This can be achieved by rewriting the supersaturated basis so that the terms 
with degree higher than hnear (degree one) are orthogonal to the hnear terms 
with respect to the design. Of course, the definition of K should be changed 
accordingly. 



7.2 Equivalence of forms in the case K nonsingular 

The three forms for 6 = By where B is one of the following: 
(i) B, = {XJX, + K{I - P)Ky^X^y 
(i) B2^K-\X^^,X^2YQy 

To show that Bi = B2 multiply both by Xj^Xi + K{I — P)K and note 
that PX^ = to obtain respectively and X^XR-^X^Q. But from the 
definition of Q and using block the partition inverse formula we see that that 
XK~^X^ — and we are done (reversing the steps). 

To show that B2 — -B3 we multiply both by X~^^K. Then B2 gives 

X-'''kK-\Xi,,Xi2)'^QQ-' = X-'^{Xi,,Xi2)^ = Q, 
and i?3 gives 

_ Mll-Ai2A22^A2l\^_l _ /'All-Al2A22^A2l\^-l _ (I\ 
- \A21-A22A-^A2i)^ ~ \ - W- 

Again, reversing the steps we obtain our result. 
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